Skip to content

Add necessary flags to dump optimized inlining info and fix marker placement in tinybench#83

Open
GuillaumeLagrange wants to merge 3 commits into
mainfrom
cod-2821-v8-maglevturbofan-inlining-loses-nested-inline-information
Open

Add necessary flags to dump optimized inlining info and fix marker placement in tinybench#83
GuillaumeLagrange wants to merge 3 commits into
mainfrom
cod-2821-v8-maglevturbofan-inlining-loses-nested-inline-information

Conversation

@GuillaumeLagrange

Copy link
Copy Markdown
Contributor

No description provided.

Add --perf-prof, --log-code, --no-log-source-code, --no-logfile-per-isolate
and --logfile to getV8Flags in walltime mode. --perf-prof writes the jitdump
samply uses to symbolicate JIT'd JS; --log-code writes the code log whose
inlining map lets the symbolicating recover TurboFan/Maglev inlined frames the
jitdump alone collapses. Source-code logging is left off to keep the log small.

Refs COD-2821, COD-2822
@GuillaumeLagrange GuillaumeLagrange force-pushed the cod-2821-v8-maglevturbofan-inlining-loses-nested-inline-information branch from 0b43ff4 to e07237e Compare June 18, 2026 15:26
@codspeed-hq

codspeed-hq Bot commented Jun 18, 2026

Copy link
Copy Markdown

Merging this PR will degrade performance by 27.74%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 11 improved benchmarks
❌ 20 regressed benchmarks
✅ 194 untouched benchmarks
⏩ 1 skipped benchmark1

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Memory fibo 30 19 B 65,624 B -99.97%
Memory fibo 15 5 B 128 B -96.09%
Memory test sync baz 100 1 B 21 B -95.24%
Memory test sync baz 100 1 B 21 B -95.24%
Simulation test_recursive_fibo_10 145.2 µs 419.3 µs -65.37%
Simulation switch 1 101.2 µs 158 µs -35.91%
Simulation one 402.7 µs 591.1 µs -31.88%
Memory wait 1ms 7 B 10 B -30%
Simulation switch 1 131.5 µs 187.5 µs -29.86%
Simulation wait 500ms 12.1 ms 17.1 ms -29.22%
Simulation switch 1 130 µs 183.1 µs -28.99%
Simulation switch 1 129.8 µs 182.7 µs -28.98%
Simulation wait 1sec 23 ms 31.9 ms -28.03%
Simulation test_recursive_fibo_10 146.6 µs 197.7 µs -25.85%
WallTime test_recursive_cached_fibo_30 3.4 µs 4 µs -15.62%
Simulation switch 2 11.2 µs 12.9 µs -13.28%
WallTime test_recursive_cached_fibo_20 2.7 µs 3 µs -11.86%
WallTime switch 1 276 ns 312 ns -11.54%
Simulation test sync baz 100 30.7 µs 34.5 µs -10.83%
Simulation test sync baz 100 30.6 µs 34.3 µs -10.67%
... ... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing cod-2821-v8-maglevturbofan-inlining-loses-nested-inline-information (5935e1a) with main (bcd7e64)

Open in CodSpeed

Footnotes

  1. 1 benchmark was skipped, so the baseline result was used instead. If it was deleted from the codebase, click here and archive it to remove it from the performance reports.

@greptile-apps

greptile-apps Bot commented Jun 18, 2026

Copy link
Copy Markdown

Greptile Summary

This PR moves the walltime instrument window from wrapping the entire task.run() call to being driven by tinybench's setup/teardown hooks, so that warmup iterations and post-run statistics computation (processRunResult) are excluded from the measurement. It also adds V8 flags (--perf-prof, optional --logfile) to emit inlining/JIT data for samply profiling, and skips the linux-perf path in walltime mode.

  • walltime.ts — new installInstrumentHooks() intercepts bench.opts.setup/teardown; openInstrumentWindow is called at the end of setup (mode \"run\") and closeInstrumentWindow at the start of teardown, preserving the user's original hooks in order.
  • introspection.tsgetV8Flags gains a walltime switch case that unconditionally adds --perf-prof and conditionally adds V8 log flags when CODSPEED_V8_LOG is set.
  • shared.tsopenInstrumentWindow / closeInstrumentWindow extracted from the two wrapWithInstrumentHooks* helpers, making them available to the new hook-based path.

Confidence Score: 5/5

Safe to merge; the hook-based measurement refactor is logically sound and the V8 flag additions are additive-only.

The core change — moving the instrument window into tinybench's setup/teardown hooks — is straightforward, well-commented, and the sequential-task assumption is valid for tinybench. The V8 flag additions are guarded by the CODSPEED_V8_LOG env var and degrade gracefully to a warning if flags are missing. Open questions from earlier threads (async setup ordering, type cast on bench.opts, missing break) are pre-existing and minor. The only new item introduced here is a pinned alpha runner version in CI, which is intentional for testing but carries no runtime risk.

No files require special attention beyond what was already noted in earlier review threads on packages/tinybench-plugin/src/walltime.ts.

Important Files Changed

Filename Overview
packages/tinybench-plugin/src/walltime.ts Adds installInstrumentHooks() that drives the measurement window from tinybench's setup/teardown lifecycle, correctly excluding warmup and stats computation. Removes direct wrapWithInstrumentHooks* calls; instrument window is now opened in the setup hook and closed in teardown. Pre-existing concerns (type cast for bench.opts, async setup ordering, non-null assertion on runStart) flagged in earlier review threads.
packages/core/src/introspection.ts Adds a walltime switch case that emits --perf-prof plus optional V8 log flags controlled by CODSPEED_V8_LOG. The walltime case is missing a break statement (flagged in an earlier review thread); harmless today but a latent risk if a case is added after it.
packages/core/src/index.ts Skips linuxPerf.start()/stop() in walltime mode; Node's native profiling flags are used instead. The guard uses getCodspeedRunnerMode() consistently and is straightforward.
packages/tinybench-plugin/src/shared.ts Extracts openInstrumentWindow() and closeInstrumentWindow() from the two wrapWithInstrumentHooks* methods, making them reusable from the new hook-based path in walltime.ts. Pure refactor, no behavioural change for existing callers.
.github/workflows/codspeed.yml Adds CODSPEED_WALLTIME_PROFILER: samply env var and pins runner-version: v4.17.7-alpha.1 for the walltime benchmark job. Using an alpha runner version in CI is intentional for testing the new profiling path, but could cause CI failures if the alpha release is removed.
CLAUDE.md Updates dev workflow documentation from moon to Turborepo commands. No code impact.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant TB as tinybench
    participant WBR as WalltimeBenchRunner
    participant IH as InstrumentHooks
    participant Task

    WBR->>TB: installInstrumentHooks()
    note over WBR: opts.setup / opts.teardown installed

    WBR->>TB: runTaskAsync(task, uri)
    TB->>IH: mongoMeasurement.start(uri)
    TB->>Task: task.run()
    activate Task
    note over Task: warmup phase (setup("warmup") / teardown("warmup") loops)
    Task->>WBR: opts.setup(task, "run")
    WBR->>IH: startBenchmark()
    WBR->>IH: currentTimestamp() → runStart
    Task->>Task: measured iterations
    Task->>WBR: opts.teardown(task, "run")
    WBR->>IH: currentTimestamp() → runEnd
    WBR->>IH: stopBenchmark()
    WBR->>IH: setExecutedBenchmark()
    WBR->>IH: addMarker(BENCHMARK_START, runStart)
    WBR->>IH: addMarker(BENCHMARK_END, runEnd)
    note over WBR: userTeardown(task, "run")
    Task->>Task: processRunResult (stats)
    deactivate Task
    TB->>IH: mongoMeasurement.stop(uri)
    WBR->>WBR: registerCodspeedBenchmarkFromTask(task)
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant TB as tinybench
    participant WBR as WalltimeBenchRunner
    participant IH as InstrumentHooks
    participant Task

    WBR->>TB: installInstrumentHooks()
    note over WBR: opts.setup / opts.teardown installed

    WBR->>TB: runTaskAsync(task, uri)
    TB->>IH: mongoMeasurement.start(uri)
    TB->>Task: task.run()
    activate Task
    note over Task: warmup phase (setup("warmup") / teardown("warmup") loops)
    Task->>WBR: opts.setup(task, "run")
    WBR->>IH: startBenchmark()
    WBR->>IH: currentTimestamp() → runStart
    Task->>Task: measured iterations
    Task->>WBR: opts.teardown(task, "run")
    WBR->>IH: currentTimestamp() → runEnd
    WBR->>IH: stopBenchmark()
    WBR->>IH: setExecutedBenchmark()
    WBR->>IH: addMarker(BENCHMARK_START, runStart)
    WBR->>IH: addMarker(BENCHMARK_END, runEnd)
    note over WBR: userTeardown(task, "run")
    Task->>Task: processRunResult (stats)
    deactivate Task
    TB->>IH: mongoMeasurement.stop(uri)
    WBR->>WBR: registerCodspeedBenchmarkFromTask(task)
Loading

Reviews (2): Last reviewed commit: "TO DROP: runner alpha w/ samply" | Re-trigger Greptile

Comment on lines +45 to +47
const opts = this.bench.opts as { setup: Hook; teardown: Hook };
const userSetup = opts.setup;
const userTeardown = opts.teardown;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 If a user doesn't pass setup/teardown to their Bench constructor, bench.opts.setup and bench.opts.teardown are typed as optional (setup?: Hook in BenchOptions). The comment asserts tinybench always resolves them to no-ops, but the explicit type cast as { setup: Hook; teardown: Hook } is a sign the compiler can't confirm this. If either value is undefined at the time installInstrumentHooks runs, calling userSetup(task, mode) or userTeardown(task, mode) will throw TypeError: X is not a function for every walltime benchmark run. Since the peerDependency spans all tinybench >=4.0.1 versions, this behaviour should not rely on an undocumented implementation detail. Nullish fallbacks cost nothing and eliminate the risk entirely.

Suggested change
const opts = this.bench.opts as { setup: Hook; teardown: Hook };
const userSetup = opts.setup;
const userTeardown = opts.teardown;
const opts = this.bench.opts as { setup?: Hook; teardown?: Hook };
const noop: Hook = () => {};
const userSetup = opts.setup ?? noop;
const userTeardown = opts.teardown ?? noop;

Comment on lines +49 to +55
opts.setup = (task, mode) => {
const setupResult = userSetup(task, mode);
if (mode === "run") {
this.runStart = this.openInstrumentWindow();
}
return setupResult;
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 If the user supplied an async setup hook, userSetup(task, mode) returns a Promise<void> that is captured in setupResult but not awaited before openInstrumentWindow() is called. The instrument window therefore opens while the async setup is still in flight, potentially attributing asynchronous prep time (e.g. data seeding, connection reset) to the measurement. Awaiting the user hook first keeps the marker placement accurate.

Suggested change
opts.setup = (task, mode) => {
const setupResult = userSetup(task, mode);
if (mode === "run") {
this.runStart = this.openInstrumentWindow();
}
return setupResult;
};
opts.setup = async (task, mode) => {
await userSetup(task, mode);
if (mode === "run") {
this.runStart = this.openInstrumentWindow();
}
};

Comment on lines +44 to 55
if (v8LogDir) {
flags.push(
...[
"--log-code",
"--no-log-source-code",
"--no-logfile-per-isolate",
`--logfile=${path.join(v8LogDir, V8_LOG_FILENAME_PATTERN)}`,
],
);
}
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The walltime case is missing a break statement, unlike the analysis case above it. While harmless here since there is no subsequent case, the inconsistency can confuse future readers and will trip no-fallthrough linters.

Suggested change
if (v8LogDir) {
flags.push(
...[
"--log-code",
"--no-log-source-code",
"--no-logfile-per-isolate",
`--logfile=${path.join(v8LogDir, V8_LOG_FILENAME_PATTERN)}`,
],
);
}
}
}
if (v8LogDir) {
flags.push(
...[
"--log-code",
"--no-log-source-code",
"--no-logfile-per-isolate",
`--logfile=${path.join(v8LogDir, V8_LOG_FILENAME_PATTERN)}`,
],
);
}
break;
}
}

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants